Improving Speaker Recognition by Training on Emotion-Added Models
Authors
Abstract
In speaker recognition applications, changes in a speaker's emotional state are a main cause of errors. The ongoing work described in this contribution attempts to improve the performance of automatic speaker recognition (ASR) systems on emotional speech. Two procedures that need only a small quantity of affective training data are applied to the ASR task, which makes them practical in real-world situations. The method classifies emotional states by acoustic features and then generates an emotion-added model based on the resulting emotion grouping. Experiments on the Emotional Prosody Speech (EPS) corpus show significant improvement in equal error rates (EERs) and identification rates (IRs) compared with the baseline and comparative experiments.
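The abstract reports results as EERs and IRs without saying how they were computed. As a minimal sketch of the two metrics, assuming higher scores mean a better match and using hypothetical function names (`equal_error_rate`, `identification_rate`) that do not come from the paper:

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    A trial is accepted when its score is >= the threshold (an assumption).
    Returns (eer, threshold) at the point where the false-accept rate and
    false-reject rate are closest; the EER is taken as their average there.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False accepts: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejects: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def identification_rate(predicted_speakers, true_speakers):
    """Fraction of test utterances assigned to the correct speaker."""
    correct = sum(p == t for p, t in zip(predicted_speakers, true_speakers))
    return correct / len(true_speakers)


if __name__ == "__main__":
    # Toy scores for illustration only; these are not results from the paper.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine, impostor)
    print(f"EER = {eer:.2f} at threshold {threshold}")
    print(f"IR  = {identification_rate(['a', 'b', 'c'], ['a', 'b', 'b']):.2f}")
```

A lower EER and a higher IR both indicate better performance, which is the sense in which the abstract reports improvement.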
Similar Resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Modeling Emotion Expression and Perception Behavior in Auditive Emotion Evaluation
In this paper, we consider both speaker dependent and listener dependent aspects in the assessment of emotions in speech. We model the speaker dependencies in emotional speech production by two parameters which describe the individual’s emotional expression behavior. Similarly, we model the listener’s emotion perception behavior by a simple parametric model. These models form a basis for improv...
Speaker Clustering in Emotion Recognition
Speaker variability is a known challenge for emotion recognition; however, little work has been done on speaker similarity in terms of its contribution to performance in the emotion classification task. In this paper, we investigate this topic and find a clear link between speaker proximity and recognition accuracy. Motivated by this result, emotion-based speaker clustering is proposed ...
Applying pitch-dependent difference detection and modification to emotional speaker recognition
Emotion is an internal factor that can degrade speaker recognition performance by inducing extra intra-speaker vocal variability. Several enhancements have been applied to speaker recognition systems for emotional speech. However, these methods are limited by requiring either emotional speech in training or knowledge of the speaker's emotional state in testing. This pa...